Background: Fusion transcripts are formed by combining exons from two different genes, often due to structural rearrangements such as deletions, inversions or translocations (genomic rearrangement-dependent, GRD) or through aberrant splicing (genomic rearrangement-independent, GRI). In hematological malignancies, many fusion transcripts act as driver events, playing crucial roles in leukemogenesis and serving as diagnostic markers, such as BCR::ABL1. However, the role and diagnostic significance of other fusion transcripts, like P2RY8::CD99—which has been found in both healthy samples and B-ALL patients—are less clear. Similarly, the clinical relevance of other GRI fusion transcripts, such as SEMA6A::FEM1C, often remains uncertain. Single-cell analysis is a promising approach to detect these fusion transcripts within individual cells and cell populations, shedding light on the subclonal architecture and cells of origin.

Aim: To evaluate the feasibility of integrating long-read sequencing with single-cell library preparation to detect and characterize GRD and GRI fusion transcripts.

Patients and Methods: Our cohort consisted of 10 B-ALL samples harbouring various fusion transcripts, as identified through bulk whole transcriptome analysis: BCR::ABL1 (n = 6, GRD), SEMA6A::FEM1C (n = 10, GRI), EBF1::PDGFRB (n = 1, GRD), and P2RY8::CD99 (n = 1, GRI). Cryopreserved cells were processed using the GEM-X Universal 3' Expression Library Prep Kit (10x Genomics), and the resulting GEM-X cDNAs carrying cell-specific labels served as input for the Kinnex single-cell RNA kit (PacBio). GEM-X libraries were sequenced on the NovaSeq X instrument (Illumina) to a median depth of 21,565 reads per cell. Kinnex libraries were sequenced on the Revio (PacBio), yielding a mean of 3,767,066 HiFi reads per sample with a mean HiFi read length of 14.2 kb. GEM-X libraries were analyzed using Cell Ranger (v9.0.1) and Seurat (v5.2.1). Kinnex libraries were preprocessed with the Iso-Seq workflow (PacBio), and fusion transcripts were called with pbfusion.

Results: Cells from different samples were merged and clustered based on their gene expression profiles from GEM-X libraries. Cell clusters were annotated as B-cells based on CD10, CD79A, and PAX5 expression; T-cells based on CD3 expression; hematopoietic stem cells as CD34+; and myeloid cells by the absence of lymphoid markers and the presence of CD14, FCER1G, and CEBPD. Within the B-cell population, three distinct subpopulations corresponding to different B-cell developmental states were identified: pro-B, pre-pro-B, and pro-B VDJ. The pro-B VDJ state, characterized by heavy chain rearrangement, showed increased CD20 expression. Furthermore, cell cycle analysis using 100 genes associated with the S-phase/G2M-phase revealed a subpopulation of cycling pro-B cells marked by high levels of MKI67 (G2M-phase) and MCM4 (S-phase). Our fusion calling pipeline successfully detected BCR::ABL1, SEMA6A::FEM1C, EBF1::PDGFRB, and P2RY8::CD99 fusion transcripts within the long-read dataset, without any false positive calls at the sample level. When mapped to the cellular landscape, these fusion transcripts were found exclusively in B precursor cells. Notably, cells in the pro-B VDJ state did not harbour any fusion transcripts. Interestingly, P2RY8::CD99 was identified only in B-cells, not in T-cells or myeloid progenitors, suggesting its association with pathogenic cells. Similarly, SEMA6A::FEM1C, a GRI fusion transcript, was confined to B precursor cells, indicating an association with the B-ALL clone.
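
Mapping fusion transcripts to the cellular landscape relies on the cell-specific labels shared between the short-read and long-read libraries. The following is a minimal Python sketch of that idea, not the pipeline used in this study: it assumes fusion-supporting long reads have already been reduced to (cell barcode, fusion name) pairs and that each barcode has a cell type label from the short-read clustering. The function name, both input formats, and the toy barcodes are hypothetical.

```python
from collections import defaultdict


def fusions_by_cell_type(fusion_reads, cell_types):
    """Count distinct fusion-positive cells per (fusion, cell type).

    fusion_reads: iterable of (cell_barcode, fusion_name) pairs, one per
                  fusion-supporting long read (hypothetical input format).
    cell_types:   dict mapping cell_barcode -> cell type label from the
                  short-read clustering (hypothetical input format).
    """
    cells = defaultdict(set)
    for barcode, fusion in fusion_reads:
        cell_type = cell_types.get(barcode)
        if cell_type is None:
            continue  # barcode is not among the annotated cells
        # A set ensures a cell is counted once, however many reads support it.
        cells[(fusion, cell_type)].add(barcode)
    return {key: len(barcodes) for key, barcodes in cells.items()}


# Toy data, invented for illustration only.
cell_types = {
    "AAAC": "pro-B",
    "AAAG": "pro-B",
    "AACT": "T-cell",
    "AAGT": "pro-B VDJ",
}
fusion_reads = [
    ("AAAC", "BCR::ABL1"),
    ("AAAC", "BCR::ABL1"),      # second read from the same cell
    ("AAAG", "SEMA6A::FEM1C"),
    ("TTTT", "BCR::ABL1"),      # barcode without a cell type annotation
]
counts = fusions_by_cell_type(fusion_reads, cell_types)
```

Counting distinct barcodes rather than reads keeps highly expressed fusions from inflating the number of fusion-positive cells in a cluster.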

Conclusions: We have demonstrated the feasibility and utility of integrating long-read isoform sequencing with single-cell library preparation for detecting fusion transcripts of diverse origins. This transcriptome-wide approach enables disease-agnostic detection and characterization of fusion transcripts at the individual cell level, covering not only those arising from chromosomal aberrations (GRD) but also those resulting from aberrant splicing (GRI). Moreover, this method facilitates a comprehensive analysis of cell-specific gene expression profiles and fusion transcripts and could be expanded to include the detection of single-nucleotide variants and copy number changes, thereby completing the molecular profile. While not currently designed for integration into routine workflows, this method represents a valuable tool for research projects aimed at elucidating the molecular mechanisms underlying various diseases, with the goal of enhancing patient care.
